Abstract

We consider random arrays indexed by the leaves of an infinitary rooted tree of finite depth,
with the distribution invariant under the rearrangements that preserve the tree structure. We
call such arrays hierarchically exchangeable and prove that they satisfy an analogue of de
Finetti's theorem. We also prove a more general result for arrays indexed by several trees,
which includes a hierarchical version of the Aldous-Hoover representation.
1 Introduction

The subject of exchangeability is prevalent in probability theory (see e.g. [2], Chapters 7-9 in [11],
or [3], [4] and [5] for recent overviews and results) and the goal of this paper is to study another
notion of exchangeability that is motivated by spin glass models and, in particular, by the work of
Mézard and Parisi on diluted models, [12].

We begin by considering an array $(X_\alpha)_{\alpha\in\mathbb{N}^r}$ of random variables $X_\alpha$ indexed by
$\alpha\in\mathbb{N}^r$ for some integer $r\geq 1$, whose distribution is invariant under certain
rearrangements of the indices. We will think of $\mathbb{N}^r$ as the set of leaves of a rooted tree
(see Fig. 1) with the vertex set

$$\mathcal{A}(r) = \mathbb{N}^0\cup\mathbb{N}\cup\mathbb{N}^2\cup\ldots\cup\mathbb{N}^r,$$ (1)

where $\mathbb{N}^0=\{\emptyset\}$, $\emptyset$ is the root of the tree and each vertex
$\alpha=(n_1,\ldots,n_p)\in\mathbb{N}^p$ for $p\leq r-1$ has children

$$\alpha n := (n_1,\ldots,n_p,n)\in\mathbb{N}^{p+1}$$

for all $n\in\mathbb{N}$. Each vertex $\alpha$ is connected to the root by the path

$$\emptyset \to n_1 \to (n_1,n_2) \to \cdots \to (n_1,\ldots,n_p) = \alpha.$$
Figure 1: Index set $\mathbb{N}^r$ as the leaves of the infinitary tree $\mathcal{A}(r)$.



We will denote the set of vertices in this path by

$$p(\alpha) = \{\emptyset, n_1, (n_1,n_2), \ldots, (n_1,\ldots,n_p)\}.$$ (2)

We will consider rearrangements of $\mathbb{N}^r$ that preserve the structure of the tree $\mathcal{A}(r)$, in the sense
that they preserve the parent-child relationship. More specifically, we denote by

$$\alpha\wedge\beta := |p(\alpha)\cap p(\beta)|$$ (3)

the number of common vertices in the paths from the root to the vertices $\alpha$ and $\beta$, and consider
the following group of maps on $\mathbb{N}^r$,

$$H_r = \bigl\{\pi:\mathbb{N}^r\to\mathbb{N}^r \;\bigm|\; \pi \text{ is a bijection},\ \pi(\alpha)\wedge\pi(\beta)=\alpha\wedge\beta \text{ for all } \alpha,\beta\in\mathbb{N}^r\bigr\}.$$ (4)

Any such map can be extended to the entire tree $\mathcal{A}(r)$ in a natural way: let $\pi(\emptyset) := \emptyset$ and

if $\pi((n_1,\ldots,n_r)) = (m_1,\ldots,m_r)$ then let $\pi((n_1,\ldots,n_p)) := (m_1,\ldots,m_p)$. (5)

Because of the condition $\pi(\alpha)\wedge\pi(\beta)=\alpha\wedge\beta$ in (4), this definition does not depend on the
coordinates $n_{p+1},\ldots,n_r$, so the extension is well-defined. It is clear that the extension preserves
the parent-child relationship. For each $\alpha\in\mathcal{A}(r)\setminus\mathbb{N}^r$, it follows that $\pi(\alpha n)=\pi(\alpha)\pi_\alpha(n)$ for
some bijection $\pi_\alpha:\mathbb{N}\to\mathbb{N}$. In other words, the condition $\pi(\alpha)\wedge\pi(\beta)=\alpha\wedge\beta$ means that we
can visualize the map $\pi$ as a recursive procedure, in which children $\alpha n$ of the vertex $\alpha\in\mathbb{N}^p$ are
rearranged among themselves for each $\alpha$. Note that $H_1$ is simply the group of all permutations of
$\mathbb{N}$.
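The definitions (2)-(5) can be made concrete on a small finite piece of the tree. The following is a minimal Python sketch, not part of the paper: vertices are represented as tuples of positive integers with the root as the empty tuple, and the helper names `path`, `meet` and `make_map` are hypothetical. It builds a map by permuting siblings at each vertex, which is the recursive picture described above, so that one can check numerically that such a map preserves $\alpha\wedge\beta$.

```python
def path(alpha):
    """Vertices on the path from the root () to alpha, as in (2)."""
    return [alpha[:k] for k in range(len(alpha) + 1)]


def meet(alpha, beta):
    """alpha ^ beta: number of common vertices on the two root paths, as in (3)."""
    return len(set(path(alpha)) & set(path(beta)))


def make_map(sibling_perms):
    """Build a map pi on vertices from sibling permutations (illustrative only).

    sibling_perms[vertex] is a dict permuting the child labels of that vertex;
    labels that are not listed stay fixed. This mirrors the relation
    pi(alpha n) = pi(alpha) pi_alpha(n).
    """
    def pi(alpha):
        image = ()
        for k in range(len(alpha)):
            perm = sibling_perms.get(alpha[:k], {})
            image += (perm.get(alpha[k], alpha[k]),)
        return image
    return pi
```

For example, with `pi = make_map({(): {1: 2, 2: 1}, (1,): {1: 3, 3: 1}})` one gets `pi((1, 1)) == (2, 3)`, and `meet(pi(a), pi(b)) == meet(a, b)` holds for all leaves `a`, `b` with labels in `{1, 2, 3}`.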

We will say that an array of random variables $(X_\alpha)_{\alpha\in\mathbb{N}^r}$ taking values in a standard Borel
space $A$ (i.e. Borel-isomorphic to a Borel subset of a Polish space) is hierarchically exchangeable,
or $H$-exchangeable, if

$$(X_{\pi(\alpha)})_{\alpha\in\mathbb{N}^r} \stackrel{d}{=} (X_\alpha)_{\alpha\in\mathbb{N}^r}$$ (6)

for all $\pi\in H_r$. Throughout the paper, we will view any array of random variables as a random
element in the product space, so the equality in distribution is always in the sense of equality of
the finite dimensional distributions. Because of this, one can replace the condition in (4) that $\pi$ is a
bijection by the condition that $\pi$ is simply an injection, since any injection restricted to finitely many
elements can clearly be extended to a bijection preserving the property $\pi(\alpha)\wedge\pi(\beta)=\alpha\wedge\beta$.

The case of $r=1$ corresponds to the classical notion of an exchangeable sequence, and in the
general case of $r\geq 1$ we will prove the following analogue of de Finetti's classical theorem. One
natural example of an $H$-exchangeable array is given by (recall the notation in (2))

$$X_\alpha = \sigma\bigl((v_\beta)_{\beta\in p(\alpha)}\bigr),$$ (7)

where $\sigma:[0,1]^{r+1}\to A$ is a measurable function, and $v_\alpha$ for $\alpha\in\mathcal{A}(r)$ are i.i.d. random variables
with the uniform distribution on $[0,1]$. The reason this array is hierarchically exchangeable is
because, by the definition of $\pi$, the random variables $v_{\pi(\alpha)}$ for $\alpha\in\mathcal{A}(r)$ are also i.i.d. and uniform
on $[0,1]$, $p(\pi(\alpha))=\pi(p(\alpha))$ and $X_{\pi(\alpha)}=\sigma\bigl((v_{\pi(\beta)})_{\beta\in p(\alpha)}\bigr)$. We will show the following.

Theorem 1 Any hierarchically exchangeable array $(X_\alpha)_{\alpha\in\mathbb{N}^r}$ can be generated in distribution as
in (7) for some measurable function $\sigma$.
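To illustrate the construction (7), here is a minimal Python sketch under stated assumptions: the class `UniformField`, the function `generate` and the particular `sigma_example` (for $r=2$) are hypothetical and chosen only for illustration. One uniform variable is attached lazily to each vertex of the tree, and $X_\alpha$ is computed from the variables along the path $p(\alpha)$, so two leaves share exactly the randomness of their common ancestors.

```python
import random


def path(alpha):
    """Vertices on the path from the root () to alpha, as in (2)."""
    return [alpha[:k] for k in range(len(alpha) + 1)]


class UniformField:
    """Lazily sampled i.i.d. Uniform[0,1) variables v_beta, one per tree vertex."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._values = {}

    def __getitem__(self, beta):
        # Sample v_beta the first time the vertex is visited, then reuse it.
        if beta not in self._values:
            self._values[beta] = self._rng.random()
        return self._values[beta]


def generate(sigma, v, alpha):
    """X_alpha = sigma((v_beta) for beta in p(alpha)), as in (7)."""
    return sigma(*(v[beta] for beta in path(alpha)))


def sigma_example(x0, x1, x2):
    # Hypothetical sigma for r = 2: a coin whose bias is set by the ancestors.
    return 1 if x2 < (x0 + x1) / 2 else 0
```

Calling `generate(sigma_example, v, (3, 5))` and `generate(sigma_example, v, (3, 7))` with the same field `v` reuses the variables at the root and at the vertex `(3,)`, which is the source of the dependence between siblings.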

This result is not very difficult to prove, and one can give several different arguments. We will
describe an approach that will be a natural first step toward the general case of processes indexed
by several trees or, more specifically, by product sets of the form $\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}$ for any integers
$r_1,\ldots,r_\ell\geq 1$. Recalling the definition (4), let us denote

$$H_{r_1,\ldots,r_\ell} = H_{r_1}\times\cdots\times H_{r_\ell},$$ (8)

and for any $\pi=(\pi_1,\ldots,\pi_\ell)\in H_{r_1,\ldots,r_\ell}$ and any $\alpha=(\alpha_1,\ldots,\alpha_\ell)\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}$, let us denote

$$\pi(\alpha) = (\pi_1(\alpha_1),\ldots,\pi_\ell(\alpha_\ell)).$$

We will say that an array of random variables $X_\alpha$ indexed by $\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}$ and taking values
in a standard Borel space $A$ is hierarchically exchangeable, or $H$-exchangeable, if

$$(X_{\pi(\alpha)})_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}} \stackrel{d}{=} (X_\alpha)_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}}$$ (9)

for all $\pi\in H_{r_1,\ldots,r_\ell}$. Let us denote

$$\mathcal{A}(r_1,\ldots,r_\ell) = \mathcal{A}(r_1)\times\cdots\times\mathcal{A}(r_\ell)$$

and, for $\alpha=(\alpha_1,\ldots,\alpha_\ell)\in\mathcal{A}(r_1,\ldots,r_\ell)$, denote

$$p(\alpha) := p(\alpha_1)\times\cdots\times p(\alpha_\ell).$$

Then, again, the natural class of $H$-exchangeable arrays is those of the form

$$X_\alpha = \sigma\bigl((v_\beta)_{\beta\in p(\alpha)}\bigr),$$ (10)

for some measurable function $\sigma:[0,1]^{(r_1+1)\cdots(r_\ell+1)}\to A$ and a family of i.i.d. random variables
$v_\beta$ indexed by $\beta\in\mathcal{A}(r_1,\ldots,r_\ell)$ with the uniform distribution on $[0,1]$.

Theorem 2 Any hierarchically exchangeable array $(X_\alpha)_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}}$ can be generated in
distribution as in (10) for some measurable function $\sigma$.
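For several trees, the index set of the uniform variables entering (10) is the product of the individual root paths. The sketch below is illustrative only, with hypothetical helper names `product_path` and `generate`, and with the field `v` passed as a plain dictionary. It enumerates $p(\alpha)=p(\alpha_1)\times\cdots\times p(\alpha_\ell)$; for a leaf index this set has $(r_1+1)\cdots(r_\ell+1)$ elements, for instance $2(r+1)$ in the setting of (12), where the second tree has depth one.

```python
from itertools import product


def path(alpha):
    """Vertices on the path from the root () to alpha, as in (2)."""
    return [alpha[:k] for k in range(len(alpha) + 1)]


def product_path(alphas):
    """p(alpha) = p(alpha_1) x ... x p(alpha_l) for alpha = (alpha_1, ..., alpha_l)."""
    return list(product(*(path(a) for a in alphas)))


def generate(sigma, v, alphas):
    """X_alpha = sigma((v_beta) for beta in p(alpha)), as in (10).

    Here v is any mapping from product vertices to numbers in [0, 1],
    and sigma takes the list of values along the product path.
    """
    return sigma([v[beta] for beta in product_path(alphas)])
```

With `alphas = ((1, 2), (4,))`, that is $r_1=2$ and $r_2=1$, `product_path(alphas)` has $3\cdot 2=6$ elements, starting with the pair of roots `((), ())` and ending with the leaf pair itself.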

The main example we have in mind is when the array $(X_{\alpha,i})$ is indexed by $\alpha\in\mathbb{N}^r$ and $i\in\mathbb{N}$,
in which case (9) becomes

$$(X_{\pi(\alpha),\rho(i)})_{\alpha\in\mathbb{N}^r,\,i\in\mathbb{N}} \stackrel{d}{=} (X_{\alpha,i})_{\alpha\in\mathbb{N}^r,\,i\in\mathbb{N}}$$ (11)

for all $\pi\in H_r$ and all bijections $\rho:\mathbb{N}\to\mathbb{N}$. Theorem 2 implies that any such array can be generated
in distribution as

$$X_{\alpha,i} = \sigma\bigl((v_\beta)_{\beta\in p(\alpha)},\,(v^i_\beta)_{\beta\in p(\alpha)}\bigr),$$ (12)

where $\sigma:[0,1]^{2(r+1)}\to A$ is a measurable function and all $v_\alpha$ and $v^i_\alpha$ for $\alpha\in\mathcal{A}(r)$ and $i\in\mathbb{N}$ are
i.i.d. random variables with the uniform distribution on $[0,1]$. This can be viewed as a hierarchical
version of the Aldous-Hoover representation ([1], [2], [8], [9]), which corresponds to the case $r=1$.
Such a representation is motivated by the predictions about the structure of the Gibbs measure in
diluted spin glass models that originate in the work of Mézard and Parisi [12] (see [13] or Chapter
4 in [15] for a more detailed mathematical formulation). The random variable $X_{\alpha,i}$ represents the
magnetization of the $i$th spin in the pure state $\alpha$, and the tree structure as above stems from the
ultrametric organization of the pure states in the Parisi ansatz, which was recently proved in [14].
The reason why spin magnetizations $X_{\alpha,i}$ are hierarchically exchangeable will be explained in
future work, and here we only present the representation result for such arrays, which might be of
independent interest.

Finally, although this is not directly related to the results in this paper, an interested reader can
find a study of another notion of exchangeability on (infinite infinitary) trees in Section III.13 in
[2].



2 The case of one tree

It is well known that any standard Borel space is Borel-isomorphic to a Borel subset of $[0,1]$ (see
e.g. Section 13.1 in [6]), which means that it is enough to prove Theorems 1 and 2 with random
variables $X_\alpha$ taking values in $[0,1]$, which we will assume from now on. All the arrays that we
will deal with will take values in the product space of countably many copies of $[0,1]$, which is
a compact space. For simplicity of notation, we will continue to denote all such spaces by $A$. We
will denote by $\Pr A$ the space of probability measures on $A$ equipped with the topology of weak
convergence, which is also a compact space. If a sequence $(X_n)_n$ of $A$-valued random variables is
such that the empirical distributions

$$\frac{1}{N}\sum_{n=1}^{N}\delta_{X_n}$$

converge almost surely to some $(\Pr A)$-valued random variable, then we will call this limit the
empirical measure of $(X_n)_n$ and denote it by $\mathcal{E}((X_n)_n)$. Our key tool will be the following strong
version of de Finetti's theorem (see Proposition 1.4, Corollary 1.5 and Corollary 1.6 from [11]).
Theorem 3 (de Finetti-Hewitt-Savage Theorem) Suppose $(X_n)_n$ is an exchangeable sequence of
$A$-valued random variables. Then the empirical measure $\mathcal{E}((X_n)_n)$ exists almost surely and has the
following properties:

(i) $\mathcal{E}((X_n)_n)$ is almost surely a function of $(X_n)_n$;

(ii) given $\mathcal{E}((X_n)_n)$, the random variables $X_n$ are i.i.d. with the distribution $\mathcal{E}((X_n)_n)$;

(iii) if $Z$ is any other random variable on the same probability space such that

$$(Z,X_1,X_2,\ldots) \stackrel{d}{=} (Z,X_{\pi(1)},X_{\pi(2)},\ldots) \quad\text{for all } \pi\in H_1$$ (13)

then the sequence $(X_n)_n$ is conditionally independent from $Z$ given $\mathcal{E}((X_n)_n)$.
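A simple numerical illustration of the empirical measure in Theorem 3 is sketched below. It rests on an assumption made only for the example: the exchangeable sequence is a mixture of i.i.d. coin flips, with a random bias $p$ drawn uniformly and then $X_n$ i.i.d. Bernoulli($p$). The function names are hypothetical. In this case the empirical measure is the Bernoulli($p$) law, so the empirical frequency of ones should be close to the hidden $p$ for large $N$.

```python
import random


def sample_exchangeable(n, rng):
    """Draw a hidden bias p, then n conditionally i.i.d. Bernoulli(p) variables."""
    p = rng.random()
    xs = [1 if rng.random() < p else 0 for _ in range(n)]
    return p, xs


def empirical_measure(xs):
    """Empirical distribution (1/N) * sum of point masses at X_n, as a dict."""
    n = len(xs)
    counts = {}
    for x in xs:
        counts[x] = counts.get(x, 0) + 1
    return {x: c / n for x, c in counts.items()}
```

With `p, xs = sample_exchangeable(20000, random.Random(0))`, the value `empirical_measure(xs).get(1, 0.0)` approximates `p`, which is a finite-sample view of property (i): the directing measure is recovered from the sequence itself.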

Proof of Theorem 1. The proof will be by induction on $r\geq 1$. For each $\alpha\in\mathbb{N}^{r-1}$, by Theorem 3,
the empirical measures

$$X_\alpha := \mathcal{E}((X_{\alpha n})_n) \in \Pr A$$ (14)

exist almost surely, because hierarchical exchangeability (6) implies that $(X_{\alpha n})_n$ is exchangeable
in the index $n$ for each fixed $\alpha$. Moreover, hierarchical exchangeability together with Theorem 3
imply the following:

(a) Given $X_\alpha$ for a fixed $\alpha\in\mathbb{N}^{r-1}$, the random variables $X_{\alpha n}$, $n\in\mathbb{N}$, are i.i.d. with the
distribution $X_\alpha$.

(b) The random variables $(X_{\alpha n})_{\alpha\in\mathbb{N}^{r-1},\,n\in\mathbb{N}}$ are conditionally independent given $(X_\alpha)_{\alpha\in\mathbb{N}^{r-1}}$. This
holds because for a chosen $\alpha$, the joint distribution of all the random variables is invariant if
one permutes the sequence $(X_{\alpha n})_{n\in\mathbb{N}}$ while leaving all $(X_{\alpha' n})_{\alpha'\neq\alpha,\,n\in\mathbb{N}}$ fixed, and so (iii) of
Theorem 3 gives that the former are conditionally independent from the latter over $X_\alpha$.

(c) The empirical measures $(X_\alpha)_{\alpha\in\mathbb{N}^{r-1}}$ are hierarchically exchangeable,

$$(X_{\pi(\alpha)})_{\alpha\in\mathbb{N}^{r-1}} \stackrel{d}{=} (X_\alpha)_{\alpha\in\mathbb{N}^{r-1}} \quad\text{for all } \pi\in H_{r-1}.$$

By the induction hypothesis, property (c) yields a representation

$$(X_\beta)_{\beta\in\mathbb{N}^{r-1}} \stackrel{d}{=} \bigl(\sigma_1((v_\gamma)_{\gamma\in p(\beta)})\bigr)_{\beta\in\mathbb{N}^{r-1}}.$$ (15)

By the properties (a) and (b) and the fact that $A$ is a Borel space, there exists a measurable function
$\sigma_2:\Pr A\times[0,1]\to A$ such that, conditionally on $(X_\alpha)_{\alpha\in\mathbb{N}^{r-1}}$,

$$(X_{\alpha n})_{\alpha\in\mathbb{N}^{r-1},\,n\in\mathbb{N}} \stackrel{d}{=} \bigl(\sigma_2(X_\alpha,v_{\alpha n})\bigr)_{\alpha\in\mathbb{N}^{r-1},\,n\in\mathbb{N}},$$ (16)

where $v_{\alpha n}$ for $\alpha n\in\mathbb{N}^r$ are i.i.d. random variables uniform on $[0,1]$, independent from everything
else. In other words, we simply realize independent random variables $X_{\alpha n}$ from the distribution $X_\alpha$
as functions of independent uniform random variables $v_{\alpha n}$. (See, for instance, Lemma 7.8 in [11]
for a rather stronger result guaranteeing that this can be done.) Combining (15) and (16) implies

$$(X_\alpha)_{\alpha\in\mathbb{N}^r} \stackrel{d}{=} \bigl(\sigma((v_\beta)_{\beta\in p(\alpha)})\bigr)_{\alpha\in\mathbb{N}^r}$$

with $\sigma(x_0,x_1,\ldots,x_r) := \sigma_2(\sigma_1(x_0,x_1,\ldots,x_{r-1}),x_r)$, which finishes the proof. □
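The composition step at the end of the proof can be sketched in code. This is a simplified illustration and not the general construction: it assumes measures with finite support, represented as lists of `(value, probability)` pairs, so that $\sigma_2$ can be taken to be a quantile transform. The names `sigma2`, `compose` and `sigma1_example` are hypothetical.

```python
def sigma2(measure, x):
    """Quantile transform: map x in [0, 1) to a point distributed as `measure`.

    `measure` is a list of (value, probability) pairs with probabilities
    summing to one (finite support is an assumption of this sketch).
    """
    cumulative = 0.0
    for value, prob in measure:
        cumulative += prob
        if x < cumulative:
            return value
    return measure[-1][0]  # guard against rounding when x is close to 1


def compose(sigma1, sigma2_fn):
    """sigma(x_0, ..., x_r) := sigma2(sigma1(x_0, ..., x_{r-1}), x_r)."""
    def sigma(*xs):
        return sigma2_fn(sigma1(*xs[:-1]), xs[-1])
    return sigma


def sigma1_example(x0, x1):
    # Hypothetical sigma_1 for r = 2: a Bernoulli law whose parameter
    # is determined by the two ancestor variables.
    p = (x0 + x1) / 2
    return [(0, 1 - p), (1, p)]
```

For instance, `compose(sigma1_example, sigma2)(0.5, 0.5, 0.4)` first builds the fair-coin measure from the ancestor variables and then uses the last coordinate `0.4` to select the outcome `0`.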
3 The case of several trees

Theorem 2 will be proved by induction on $(r_1,\ldots,r_\ell)$. Of course, the case $\ell=1$ is already proved in
the previous section. However, in order to close the induction, it will actually be convenient to focus
on a more general result, describing $H$-exchangeable couplings between processes and $I$-fields,
defined as follows. We will call an array of random variables $(u_\alpha)_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)}$ taking values in
some compact spaces an $I$-field if all $u_\alpha$ are independent and the distribution of $u_\alpha$ depends only on
the "distance of $\alpha$ from the root", namely, all $u_\alpha$ have the same distribution for $\alpha\in\mathbb{N}^{p_1}\times\cdots\times\mathbb{N}^{p_\ell}$
for any given $(p_1,\ldots,p_\ell)$. We will consider a pair of processes

$$(u_\alpha)_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)},\quad (X_\alpha)_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}},$$ (17)

where $(u_\alpha)$ is an $I$-field, not necessarily independent of $(X_\alpha)$. We will assume that they are jointly
hierarchically exchangeable in the sense that

$$\bigl((u_{\pi(\alpha)})_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)},\,(X_{\pi(\alpha)})_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}}\bigr) \stackrel{d}{=} \bigl((u_\alpha)_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)},\,(X_\alpha)_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}}\bigr)$$ (18)

for all bijections $\pi\in H_{r_1,\ldots,r_\ell}$ in (8) extended in a natural way to the entire set $\mathcal{A}(r_1,\ldots,r_\ell)$, i.e.
each coordinate $\pi_i\in H_{r_i}$ is extended from $\mathbb{N}^{r_i}$ to $\mathcal{A}(r_i)$ as in (5). For convenience of notation,
given an array $Y_\alpha$ indexed by $\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)$ and a subset $S\subseteq\mathcal{A}(r_1,\ldots,r_\ell)$, we will denote
$Y_S=(Y_\alpha)_{\alpha\in S}$. For example, $Y_{p(\alpha)}=(Y_\beta)_{\beta\in p(\alpha)}$. The following proposition is a generalization of
Theorem 2.

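Proposition 1 If (18) holds then there exists a measurable function $\tau$ such that, conditionally on
the $I$-field $(u_\alpha)_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)}$,

$$(X_\alpha)_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}} \stackrel{d}{=} \bigl(\tau(u_{p(\alpha)},v_{p(\alpha)})\bigr)_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}},$$ (19)

where $(v_\alpha)_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)}$ are i.i.d. random variables uniform on $[0,1]$, independent of $(u_\alpha)_\alpha$.

Formally, this equality of distribution conditionally on $(u_\alpha)_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)}$ means the following
equality of distribution for larger families of random variables:

$$\bigl((u_\alpha)_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)},\,(X_\alpha)_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}}\bigr) \stackrel{d}{=} \bigl((u_\alpha)_{\alpha\in\mathcal{A}(r_1,\ldots,r_\ell)},\,(\tau(u_{p(\alpha)},v_{p(\alpha)}))_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}}\bigr).$$

We will generally avoid writing this out in full for the sake of lighter notation.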

Of course, (19) implies Theorem 2 by considering an $I$-field $(u_\alpha)$ independent of the process
$(X_\alpha)$. Proposition 1 will be proved by induction on $(r_1,\ldots,r_\ell)$ and, in the induction step, we will
need to describe a conditional distribution of one array given another. We will be able to replace
this second array with an $I$-field, and the independence built into the definition of $I$-fields will be
well-suited for the induction argument. The induction argument does not work so well when the
$I$-field in Proposition 1 is replaced by a general $H$-exchangeable array $(Y_\alpha)$. However, such a
generalization, described in Theorem 4 below, will follow once we have Proposition 1.
To describe the induction, it will be convenient to write members of $\mathcal{A}(r_1,\ldots,r_\ell)$ in the form
$(\omega,\alpha)$, where $\omega\in\mathcal{A}(r_1,\ldots,r_{\ell-1})$ and $\alpha\in\mathcal{A}(r_\ell)$, and also abbreviate

$$\mathcal{A} = \mathcal{A}(r_1,\ldots,r_{\ell-1}) \quad\text{and}\quad \mathcal{L} = \mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_{\ell-1}}.$$

We therefore write the pair of processes (17) as $(u_{\omega,\alpha})_{\omega\in\mathcal{A},\,\alpha\in\mathcal{A}(r_\ell)}$, $(X_{\omega,\alpha})_{\omega\in\mathcal{L},\,\alpha\in\mathbb{N}^{r_\ell}}$. To close the
induction we will make three separate appeals to simpler cases of Proposition 1, and we subdivide
the proof into stages accordingly.

Using the case of one tree 

For the first stage, it will also be convenient to introduce the notation, for each $\alpha\in\mathbb{N}^{r_\ell}$,

$$X_\alpha = (X^1_\alpha,X^2_\alpha) = \bigl((u_{\omega,\alpha})_{\omega\in\mathcal{A}},\,(X_{\omega,\alpha})_{\omega\in\mathcal{L}}\bigr),$$ (20)

which is an element of another compact space, say $A=A_1\times A_2$, where $X^j_\alpha$ take values in $A_j$ for
$j=1,2$. If we denote the subarray

$$U^- = (u_{\omega,\alpha})_{\omega\in\mathcal{A},\,\alpha\in\mathcal{A}(r_\ell-1)}$$ (21)

of our $I$-field consisting of the coordinates that do not appear in (20), then in these terms our goal
is to describe the joint distribution of $(X_\alpha)_{\alpha\in\mathbb{N}^{r_\ell}}$ and $U^-$.

First of all, notice that hierarchical exchangeability in (18) implies that the process $(X_\alpha)_{\alpha\in\mathbb{N}^{r_\ell}}$
is $H$-exchangeable. Hence, similarly to the proof of Theorem 1, for each $\alpha\in\mathbb{N}^{r_\ell-1}$, the empirical
measure

$$X_\alpha := \mathcal{E}((X_{\alpha n})_n) \in \Pr A$$ (22)

exists almost surely and, by Theorem 3, we get:

(a) given $X_\alpha$ for $\alpha\in\mathbb{N}^{r_\ell-1}$, the random variables $X_{\alpha n}$ are i.i.d. with the distribution $X_\alpha$;

(b) given $(X_\alpha)_{\alpha\in\mathbb{N}^{r_\ell-1}}$, the random variables $(X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in\mathbb{N}}$ are conditionally independent.

Note also that the permutation of the index $n$ for a fixed $\alpha$ does not affect the subarray (21).
Therefore, part (iii) of Theorem 3 also implies that

(c) given $(X_\alpha)_{\alpha\in\mathbb{N}^{r_\ell-1}}$, the array $(X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in\mathbb{N}}$ is independent of $U^-$.

Another important observation is that, by the definition of $I$-field, for any $\alpha\in\mathbb{N}^{r_\ell-1}$, the random
variables $X^1_{\alpha n}=(u_{\omega,\alpha n})_{\omega\in\mathcal{A}}$ in (20) are i.i.d. for $n\in\mathbb{N}$ with some fixed distribution on $A_1$ and,
therefore, the marginal of the empirical measure $X_\alpha$ in (22) on $A_1$ is this fixed nonrandom measure.
Together with the property (a) this implies:

(d) the random variables $X^1_{\alpha n}$ for $n\in\mathbb{N}$ are independent of the empirical measure $X_\alpha$.

Let us now consider an infinite subset $I\subset\mathbb{N}$ such that $I^c=\mathbb{N}\setminus I$ is also infinite. Even though
our goal is to describe the joint distribution of $(X_\alpha)_{\alpha\in\mathbb{N}^{r_\ell}}$ and $U^-$, because of the hierarchical
exchangeability it is, obviously, sufficient to describe the joint distribution of

$$(X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I} \quad\text{and}\quad U^-.$$
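This will be done in several steps, and we begin with the following lemma. We will suppose,
without loss of generality, that $1\in I$. We will write $\mathbb{P}(Y\in\cdot\mid Y')$ for the conditional distribution of
$Y$ given $Y'$.

Lemma 1 (A) The following equality holds:

$$\mathbb{P}\bigl((X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}\in\cdot \bigm| (X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I^c}\bigr) = \bigotimes_{\alpha\in\mathbb{N}^{r_\ell-1}} \mathbb{P}\bigl(X_{\alpha 1}\in\cdot \bigm| (X_{\alpha n})_{n\in I^c}\bigr)^{\otimes I}.$$ (23)

(B) Conditionally on $(X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I^c}$, the arrays $(X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}$ and $U^-$ are independent.

(C) The arrays $(X^1_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}$ and $(X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I^c}$ are independent.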

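Proof. First of all, by property (a), the empirical measure (22) satisfies

$$X_\alpha = \mathcal{E}((X_{\alpha n})_{n\in I^c}),$$ (24)

which means that $X_\alpha$ is almost surely a function of $(X_{\alpha n})_{n\in I^c}$. Therefore,

$$\mathbb{P}\bigl((X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}\in\cdot \bigm| (X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I^c},\,U^-\bigr) = \mathbb{P}\bigl((X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}\in\cdot \bigm| (X_\alpha)_{\alpha\in\mathbb{N}^{r_\ell-1}},\,(X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I^c},\,U^-\bigr).$$

Using the properties (b) and (c), this conditional distribution is equal to

$$\mathbb{P}\bigl((X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}\in\cdot \bigm| (X_\alpha)_{\alpha\in\mathbb{N}^{r_\ell-1}}\bigr).$$ (25)

The same computation obviously also works without $U^-$, and therefore

$$\mathbb{P}\bigl((X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}\in\cdot \bigm| (X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I^c},\,U^-\bigr) = \mathbb{P}\bigl((X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}\in\cdot \bigm| (X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I^c}\bigr).$$

This proves (B). Next, using the properties (a) and (b), we can rewrite (25) as (recall that $1\in I$)

$$\bigotimes_{\alpha\in\mathbb{N}^{r_\ell-1}} \mathbb{P}\bigl(X_{\alpha 1}\in\cdot \bigm| X_\alpha\bigr)^{\otimes I},$$

which proves that

$$\mathbb{P}\bigl((X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}\in\cdot \bigm| (X_{\alpha n})_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I^c}\bigr) = \bigotimes_{\alpha\in\mathbb{N}^{r_\ell-1}} \mathbb{P}\bigl(X_{\alpha 1}\in\cdot \bigm| X_\alpha\bigr)^{\otimes I}.$$ (26)

Using (24) and property (a), for any fixed $\alpha\in\mathbb{N}^{r_\ell-1}$,

$$\mathbb{P}\bigl(X_{\alpha 1}\in\cdot \bigm| (X_{\alpha n})_{n\in I^c}\bigr) = \mathbb{P}\bigl(X_{\alpha 1}\in\cdot \bigm| X_\alpha,\,(X_{\alpha n})_{n\in I^c}\bigr) = \mathbb{P}\bigl(X_{\alpha 1}\in\cdot \bigm| X_\alpha\bigr).$$

Combining the last two equations proves (A). The last claim follows from (26) and property (d)
above. □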
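Using the case of $\mathcal{A}\times\mathcal{A}(r_\ell-1)$

Now that we have utilized the exchangeability with respect to the permutations of the index $n$,
we will change the focus and make the dependence of all random variables on the index $\omega\in\mathcal{A}$
explicit. For each $\alpha\in\mathbb{N}^{r_\ell-1}$, let us denote

$$X^+_{\omega,\alpha} = (X_{\omega,\alpha n})_{n\in I^c} \quad\text{for } \omega\in\mathcal{L},$$ (27)
$$U^+_{\omega,\alpha} = (u_{\omega,\alpha n})_{n\in I^c} \quad\text{for } \omega\in\mathcal{A},$$ (28)
$$(U^+_\alpha,X^+_\alpha) = \bigl((U^+_{\omega,\alpha})_{\omega\in\mathcal{A}},\,(X^+_{\omega,\alpha})_{\omega\in\mathcal{L}}\bigr),$$ (29)
$$(u_{\alpha n},X_{\alpha n}) = \bigl((u_{\omega,\alpha n})_{\omega\in\mathcal{A}},\,(X_{\omega,\alpha n})_{\omega\in\mathcal{L}}\bigr) \quad\text{for } n\in I,$$ (30)

and let us also denote

$$(U^+,X^+) := \bigl((U^+_\alpha,X^+_\alpha)\bigr)_{\alpha\in\mathbb{N}^{r_\ell-1}},$$ (31)
$$(u,X) := \bigl((u_{\alpha n},X_{\alpha n})\bigr)_{\alpha\in\mathbb{N}^{r_\ell-1},\,n\in I}.$$ (32)

With this notation, we can rewrite (23) as

$$\mathbb{P}\bigl((u,X)\in\cdot \bigm| (U^+,X^+)\bigr) = \bigotimes_{\alpha\in\mathbb{N}^{r_\ell-1}} \mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (U^+_\alpha,X^+_\alpha)\bigr)^{\otimes I}.$$ (33)

We can also rewrite claims (B) and (C) in Lemma 1 as follows:
(B') conditionally on $(U^+,X^+)$ the arrays $(u,X)$ and $U^-$ are independent;
(C') the arrays $u$ and $(U^+,X^+)$ are independent.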

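We will now make our first appeal to the inductive hypothesis of Proposition 1 to describe the joint
distribution of $(U^+,X^+)$ and $U^-$. Notice that $U^+_{\omega,\alpha}$ in (28) and some of the coordinates $u_{\omega,\alpha}$ in $U^-$
in (21) are indexed by $\omega\in\mathcal{A}$ and $\alpha\in\mathbb{N}^{r_\ell-1}$, so we will combine them and introduce a new array
$U=(U_{\omega,\alpha})_{\omega\in\mathcal{A},\,\alpha\in\mathcal{A}(r_\ell-1)}$ such that

$$U_{\omega,\alpha} := (u_{\omega,\alpha},U^+_{\omega,\alpha}) \quad\text{for } \omega\in\mathcal{A},\ \alpha\in\mathbb{N}^{r_\ell-1},$$
$$U_{\omega,\alpha} := u_{\omega,\alpha} \quad\text{for } \omega\in\mathcal{A},\ \alpha\in\mathcal{A}(r_\ell-1)\setminus\mathbb{N}^{r_\ell-1}.$$ (34)

Slightly abusing notation, this definition can be written as $U=(U^-,U^+)$ and it is obvious that $U$
is again an $I$-field. Let us also observe right away that, by property (B'),

$$\mathbb{P}\bigl((u,X)\in\cdot \bigm| (U^+,X^+)\bigr) = \mathbb{P}\bigl((u,X)\in\cdot \bigm| (U,X^+)\bigr).$$ (35)

The following gives a description of the joint distribution of $(U^+,X^+)$ and $U^-$.

Lemma 2 Conditionally on the $I$-field $U=(U^-,U^+)$,

$$(X^+_{\omega,\alpha})_{\omega\in\mathcal{L},\,\alpha\in\mathbb{N}^{r_\ell-1}} \stackrel{d}{=} \bigl(\xi(v_{p(\omega,\alpha)},U_{p(\omega,\alpha)})\bigr)_{\omega\in\mathcal{L},\,\alpha\in\mathbb{N}^{r_\ell-1}}$$ (36)

for some measurable function $\xi$ of its coordinates, where $v_\beta$ are i.i.d. uniform random variables
on $[0,1]$ indexed by $\beta\in\mathcal{A}\times\mathcal{A}(r_\ell-1)$.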
Proof. This is a consequence of the fact that $U$ is an $I$-field, and the pair $U$ and $(X^+_{\omega,\alpha})_{\omega\in\mathcal{L},\,\alpha\in\mathbb{N}^{r_\ell-1}}$
is, clearly, a hierarchically exchangeable coupling satisfying (18) with $r_\ell$ replaced by $r_\ell-1$. By the
induction hypothesis, the claim follows. □

Let us denote the array of random variables $v$ on the right hand side of (36) by

$$V := (v_{\omega,\alpha})_{\omega\in\mathcal{A},\,\alpha\in\mathcal{A}(r_\ell-1)}.$$

Let us denote by $\Xi$ the full map on the right hand side of the equation (36), which can be then
written as

$$X^+ \stackrel{d}{=} \Xi(V,U).$$

Since all our random variables take values in standard Borel (or even compact) spaces, we can
consider the regular conditional probability

$$\nu(\,\cdot\mid x) = \mathbb{P}\bigl(V\in\cdot \bigm| (U,\Xi(V,U))=x\bigr).$$ (37)

It is a standard fact in this case that if $\mu$ is the law of $(U,X^+)$ then, for $\mu$-almost all $x$,

$$\nu\bigl(\{V \mid (U,\Xi(V,U))=x\} \bigm| x\bigr) = 1.$$ (38)

Now, using this conditional probability, let us couple the arrays $(u,X)$ in (32) and $V$ conditionally
independently given $(U,X^+)$,

$$\mathbb{P}\bigl(((u,X),V)\in\cdot \bigm| (U,X^+)=x\bigr) = \mathbb{P}\bigl((u,X)\in\cdot \bigm| (U,X^+)=x\bigr)\times\mathbb{P}\bigl(V\in\cdot \bigm| (U,X^+)=x\bigr),$$ (39)

where the last factor is understood as the regular conditional probability $\nu(\,\cdot\mid x)$ in (37).

This is a standard construction in probability, as well as in ergodic theory, where it is called a
'relatively independent joining': see, for instance, the third example in Section 6.1 of Glasner [7].
The triple

$$(u,X),\ V \ \text{ and } \ (U,X^+)$$

is still hierarchically exchangeable, since this is true separately of both conditional distributions on
the right hand side of (39) (for a much more detailed explanation see Lemma 2.3 in [10]). Having
done this, we may henceforth regard all of these processes as defined on the same background
probability space.
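The relatively independent joining can be illustrated in a toy discrete setting. The sketch below is only an analogy under a strong simplifying assumption: all variables take finitely many values and joint laws are given as dictionaries, whereas the paper works with general standard Borel spaces. The function name and the representation are hypothetical. Given the joint law of $(a,x)$ and the joint law of $(b,x)$ with a common marginal in $x$, it builds the coupling under which $a$ and $b$ are conditionally independent given $x$.

```python
def relatively_independent_joining(joint_ax, joint_bx):
    """Couple a and b conditionally independently given the shared variable x.

    joint_ax: dict {(a, x): probability}
    joint_bx: dict {(b, x): probability}, with the same x-marginal as joint_ax
    Returns a dict {(a, b, x): probability}.
    """
    # Marginal law of the shared variable x.
    marginal_x = {}
    for (_, x), p in joint_ax.items():
        marginal_x[x] = marginal_x.get(x, 0.0) + p

    joining = {}
    for (a, xa), pa in joint_ax.items():
        for (b, xb), pb in joint_bx.items():
            if xa == xb and marginal_x[xa] > 0:
                # P(a, b, x) = P(a | x) * P(b | x) * P(x) = P(a, x) * P(b, x) / P(x)
                joining[(a, b, xa)] = pa * pb / marginal_x[xa]
    return joining
```

By construction, summing the joining over `b` recovers `joint_ax` and summing over `a` recovers `joint_bx`, which is the discrete counterpart of the statement that both conditional distributions on the right hand side of (39) are preserved.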

Lemma 3 With the joint distribution constructed above,

$$\mathbb{P}\bigl((u,X)\in\cdot \bigm| (U^+,X^+)\bigr) = \mathbb{P}\bigl((u,X)\in\cdot \bigm| (V,U)\bigr).$$ (40)

Remark. Notice that this implies that the property (C') above can now be written as:
(C'') the arrays $u$ and $(V,U)$ are independent.

Proof of Lemma 3. By (38), $X^+=\Xi(V,U)$ with probability one, so $X^+$ is almost surely a function
of $V$ and $U$. Therefore,

$$\mathbb{P}\bigl((u,X)\in\cdot \bigm| (V,U)\bigr) = \mathbb{P}\bigl((u,X)\in\cdot \bigm| X^+,(V,U)\bigr).$$

By the construction (39), $(u,X)$ and $V$ are conditionally independent given $(U,X^+)$, so this
conditional distribution is equal to $\mathbb{P}((u,X)\in\cdot\mid(U,X^+))$, and (35) finishes the proof. □

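Thus, we have replaced the conditioning on $(U^+,X^+)$ on the left hand side of (33) with conditioning
on $(V,U)$, and now we will do a similar substitution in each factor on the right hand side of
(33). Recall the notation $U^+_\alpha$ and $X^+_\alpha$ in (29) and, for each $\alpha\in\mathbb{N}^{r_\ell-1}$, let us denote

$$V_\alpha := (v_{p(\omega,\alpha)})_{\omega\in\mathcal{L}} \quad\text{and}\quad U_\alpha := (U_{p(\omega,\alpha)})_{\omega\in\mathcal{L}}.$$ (41)

Notice that one factor on the right hand side of (33) is $\mathbb{P}((u_{\alpha 1},X_{\alpha 1})\in\cdot\mid(U^+_\alpha,X^+_\alpha))$, and we will
now show the following.

Lemma 4 For each $\alpha\in\mathbb{N}^{r_\ell-1}$, we have

$$\mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (U^+_\alpha,X^+_\alpha)\bigr) = \mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (V_\alpha,U_\alpha)\bigr).$$ (42)

Proof. First of all, the equation (33) implies that

$$\mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (U^+,X^+)\bigr) = \mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (U^+_\alpha,X^+_\alpha)\bigr),$$

which can be seen by considering the probabilities of cylindrical sets that depend only on $(u_{\alpha 1},X_{\alpha 1})$.
Using (40), we get

$$\mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (U^+_\alpha,X^+_\alpha)\bigr) = \mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (V,U)\bigr).$$ (43)

We saw in the proof of Lemma 3 that $X^+=\Xi(V,U)$ with probability one and, therefore,

$$X^+_\alpha = (X^+_{\omega,\alpha})_{\omega\in\mathcal{L}} = \bigl(\xi(v_{p(\omega,\alpha)},U_{p(\omega,\alpha)})\bigr)_{\omega\in\mathcal{L}}.$$

Using this and the fact that, by (34), $U^+_\alpha$ is also a function of $U_\alpha$, we obtain the following inclusion
of $\sigma$-algebras,

$$\sigma(U^+_\alpha,X^+_\alpha) \subseteq \sigma(V_\alpha,U_\alpha) \subseteq \sigma(V,U).$$

The equality of conditional distributions in (43) given the two extreme $\sigma$-algebras implies the
equality to the conditional distribution given the middle $\sigma$-algebra, and this finishes the proof. □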

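The preceding two lemmas allow us to rewrite (33) as

$$\mathbb{P}\bigl((u,X)\in\cdot \bigm| (V,U)\bigr) = \bigotimes_{\alpha\in\mathbb{N}^{r_\ell-1}} \mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (V_\alpha,U_\alpha)\bigr)^{\otimes I}.$$ (44)

In other words, conditionally on $(V,U)$, the random variables $(u_{\alpha n},X_{\alpha n})$ are independent for all
$\alpha\in\mathbb{N}^{r_\ell-1}$ and $n\in I$, and for a fixed $\alpha$, have the same distribution,

$$\mathbb{P}\bigl((u_{\alpha 1},X_{\alpha 1})\in\cdot \bigm| (V_\alpha,U_\alpha)\bigr),$$

for all $n\in I$. By the property (C'') above, $u_{\alpha 1}$ is independent of $(V_\alpha,U_\alpha)$, so our main concern now
is to describe the conditional distribution of $X_{\alpha 1}$ given $u_{\alpha 1}$, $V_\alpha$ and $U_\alpha$.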
Using the case of $\ell-1$ trees

Lastly, we will use the induction hypothesis in Proposition 1 to describe the joint distribution of the
processes $X_{\alpha 1}$, $u_{\alpha 1}$, $V_\alpha$ and $U_\alpha$ for a fixed $\alpha\in\mathbb{N}^{r_\ell-1}$, so these are indexed by $\mathcal{A}=\mathcal{A}(r_1,\ldots,r_{\ell-1})$.
The process $X_{\alpha 1}$ consists of the random variables $X_{\omega,\alpha 1}$ indexed by $\omega\in\mathcal{L}$. We will view the triple
$(u_{\alpha 1},V_\alpha,U_\alpha)$ as a new $I$-field that consists of the random variables

$$T^\alpha_\omega := \bigl(u_{\omega,\alpha 1},\,(v_{(\omega,\beta)})_{\beta\in p(\alpha)},\,(U_{(\omega,\beta)})_{\beta\in p(\alpha)}\bigr)$$ (45)

indexed by $\omega\in\mathcal{A}$. Here, we relabeled the random variables by collecting all the coordinates of
$U_\alpha$ and $V_\alpha$ that depend on a fixed $\omega\in\mathcal{A}$. By the property (C'') above, the array $T^\alpha:=(T^\alpha_\omega)_{\omega\in\mathcal{A}}$
is again an $I$-field, and it is clear that it forms a hierarchically exchangeable coupling with the array
$X_{\alpha 1}$. The induction hypothesis in Proposition 1, now used with $r_\ell=0$, implies the following.

Lemma 5 There exists a measurable function $\tau'$ such that, conditionally on $T^\alpha$,

$$(X_{\omega,\alpha 1})_{\omega\in\mathcal{L}} \stackrel{d}{=} \bigl(\tau'(w_{p(\omega)},T^\alpha_{p(\omega)})\bigr)_{\omega\in\mathcal{L}},$$ (46)

where $w$ is an array of i.i.d. random variables uniform on $[0,1]$ indexed by $\omega\in\mathcal{A}$, independent of
everything else. □

This allows us to finish the proof of Proposition 1. First of all, let us notice that we can write

$$T^\alpha_{p(\omega)} = \bigl(u_{p(\omega)\times\{\alpha 1\}},\,v_{p(\omega,\alpha)},\,U_{p(\omega,\alpha)}\bigr).$$

Combining Lemma 5 with (44), we proved that, conditionally on the arrays $u$, $V$ and $U$, we can
generate the random variables $X_{\omega,\alpha n}$ for $\omega\in\mathcal{L}$, $\alpha\in\mathbb{N}^{r_\ell-1}$, $n\in I$ in distribution by

$$X_{\omega,\alpha n} = \tau'\bigl(v_{p(\omega)\times\{\alpha n\}},\,u_{p(\omega)\times\{\alpha n\}},\,v_{p(\omega,\alpha)},\,U_{p(\omega,\alpha)}\bigr),$$ (47)

where, for each $\alpha\in\mathbb{N}^{r_\ell-1}$ and $n\in I$, we used the random variables $v_{p(\omega)\times\{\alpha n\}}$ in place of an
independent copy of $w_{p(\omega)}$ in (46). First of all,

$$\bigl(v_{p(\omega)\times\{\alpha n\}},\,v_{p(\omega,\alpha)}\bigr) = v_{p(\omega,\alpha n)}.$$

If we recall the definition of the process $U$ in (34), we see that for $\alpha\in\mathbb{N}^{r_\ell-1}$, $U_{p(\omega,\alpha)}$ consists of
two parts, $u_{p(\omega,\alpha)}$ and $U^+_{p(\omega,\alpha)}$, and the first one can be combined with $u_{p(\omega)\times\{\alpha n\}}$ to give

$$\bigl(u_{p(\omega)\times\{\alpha n\}},\,u_{p(\omega,\alpha)}\bigr) = u_{p(\omega,\alpha n)}.$$

Then, (47) can be rewritten as (slightly abusing notation)

$$X_{\omega,\alpha n} = \tau\bigl(u_{p(\omega,\alpha n)},\,v_{p(\omega,\alpha n)},\,U^+_{p(\omega,\alpha)}\bigr).$$ (48)

Finally, note that we consider the random variables $X_{\omega,\alpha n}$ with the index $n\in I$, while all the random
variables $U^+_{\omega,\alpha}$ in (28) were defined in terms of the random variables $u_{\omega,\alpha n}$ with the index $n\in I^c$.
Therefore, we can combine $v_{p(\omega,\alpha n)}$ and $U^+_{p(\omega,\alpha)}$ in (48) and view them as one single $I$-field
independent of all the random variables $u_{p(\omega,\alpha n)}$ with the index $n\in I$. This completes the induction
step in Proposition 1, and finishes the proof of Theorem 2. □

One can also now formulate a conditional version of Theorem 2 as follows. Examples of
$H$-exchangeable pairs of processes can be constructed in the form

$$(Y_\alpha,X_\alpha) = \bigl(\sigma_1(u_{p(\alpha)}),\,\sigma_2(u_{p(\alpha)},v_{p(\alpha)})\bigr),$$ (49)

for two measurable functions $\sigma_1$, $\sigma_2$ and independent $I$-fields $u$ and $v$ of uniform random variables
on $[0,1]$.

Theorem 4 Any hierarchically exchangeable array of pairs $(Y_\alpha,X_\alpha)_{\alpha\in\mathbb{N}^{r_1}\times\cdots\times\mathbb{N}^{r_\ell}}$ can be
generated in distribution as in (49) for some measurable functions $\sigma_1$ and $\sigma_2$.

Proof. This follows by first applying Theorem 2 to represent

$$(Y_\alpha)_\alpha \stackrel{d}{=} \bigl(\sigma_1(u_{p(\alpha)})\bigr)_\alpha,$$

then forming the coupling of the processes $X$ and $u$ conditionally independently over $Y$, and then
applying Proposition 1 to represent the joint distribution of $(u,X)$. □
A hierarchical version of the de Finetti and Aldous-Hoover representations.

Dmitry Panchenko



Abstract 

We consider random arrays indexed by the leaves of an infinitary rooted tree of finite depth, 
with the distribution invariant under the rearrangements that preserve the tree structure. We call 
such arrays hierarchically exchangeable and prove that they satisfy the analogue of de Finetti's 
theorem. Then, under additional standard exchangeability with respect to the second index, we 
prove a hierarchical version of the Aldous-Hoover representation. 



Key words: exchangeability, spin glasses. 
1 Introduction

The subject of exchangeability is prevalent in probability theory (see e.g. 0, Chapters 7-9 in 
(H, or and [HI for recent overviews) and the goal of this paper is to study another notion of 
exchangeability that is motivated by spin glass models. We will consider two types of random
arrays, $(X_\alpha)$ indexed by $\alpha\in\mathbb{N}^r$ for some integer $r\geq 1$ and $(X_{\alpha,i})$ indexed by $\alpha\in\mathbb{N}^r$ and $i\in\mathbb{N}$,
whose distributions are invariant under certain rearrangements of the indices. We will think of
$\mathbb{N}^r$ as the set of leaves of a rooted tree (see Fig. 1) with the vertex set

$$\mathcal{A} = \mathbb{N}^0\cup\mathbb{N}\cup\mathbb{N}^2\cup\ldots\cup\mathbb{N}^r,$$ (1)

where $\mathbb{N}^0=\{\emptyset\}$, $\emptyset$ is the root of the tree and each vertex $\alpha=(n_1,\ldots,n_p)\in\mathbb{N}^p$ for $p\leq r-1$ has
children

$$\alpha n := (n_1,\ldots,n_p,n)\in\mathbb{N}^{p+1}$$

for all $n\in\mathbb{N}$. Therefore, each vertex $\alpha$ is connected to the root by the path

$$\emptyset \to n_1 \to (n_1,n_2) \to \cdots \to (n_1,\ldots,n_p) = \alpha.$$

We will denote the set of vertices in this path by

$$p(\alpha) = \{\emptyset, n_1, (n_1,n_2), \ldots, (n_1,\ldots,n_p)\}.$$ (2)
Figure 1: Index set $\mathbb{N}^r$ as the leaves of the infinitary tree $\mathcal{A}$.



We will consider rearrangements of $\mathbb{N}^r$ that preserve the structure of the tree $\mathcal{A}$, in the sense that
they preserve the parent-child relationship. More specifically, we denote by

$$\alpha\wedge\beta := |p(\alpha)\cap p(\beta)|$$ (3)

the number of common vertices in the paths from the root to the vertices $\alpha$ and $\beta$, and consider
the following collection of maps on $\mathbb{N}^r$,

$$H = \bigl\{\pi:\mathbb{N}^r\to\mathbb{N}^r \;\bigm|\; \pi \text{ is an injection},\ \pi(\alpha)\wedge\pi(\beta)=\alpha\wedge\beta \text{ for all } \alpha,\beta\in\mathbb{N}^r\bigr\}.$$ (4)

Any such map can be extended to the entire tree $\mathcal{A}$ in a natural way: let $\pi(\emptyset) := \emptyset$ and

if $\pi((n_1,\ldots,n_r)) = (m_1,\ldots,m_r)$ then let $\pi((n_1,\ldots,n_p)) := (m_1,\ldots,m_p)$. (5)

Because of the condition $\pi(\alpha)\wedge\pi(\beta)=\alpha\wedge\beta$ in (4), this definition does not depend on the
coordinates $n_{p+1},\ldots,n_r$, so the extension is well-defined. It is clear that such an extension preserves
the parent-child relationship and, for each $\alpha\in\mathcal{A}\setminus\mathbb{N}^r$, $\pi(\alpha n)=\pi(\alpha)\pi_\alpha(n)$ for some injection
$\pi_\alpha:\mathbb{N}\to\mathbb{N}$. In other words, the condition $\pi(\alpha)\wedge\pi(\beta)=\alpha\wedge\beta$ means that we can visualize the
map $\pi$ as a recursive procedure, in which children $\alpha n$ of the vertex $\alpha\in\mathbb{N}^p$ are mapped into distinct
children of the image $\pi(\alpha)\in\mathbb{N}^p$. The image of the entire tree, $\pi(\mathcal{A})$, can be viewed as a "copy of
the tree $\mathcal{A}$" inside $\mathcal{A}$.

We will say that an array of random variables $(X_\alpha)_{\alpha\in\mathbb{N}^r}$ is hierarchically exchangeable, or
$H$-exchangeable, if

$$(X_{\pi(\alpha)})_{\alpha\in\mathbb{N}^r} \stackrel{d}{=} (X_\alpha)_{\alpha\in\mathbb{N}^r}$$ (6)

for all $\pi\in H$. Throughout the paper, we will view any array of random variables as an element
in the product space, so the equality in distribution is always in the sense of equality of the finite
dimensional distributions. The case of $r=1$ corresponds to the classical notion of an exchangeable
sequence, and in the general case of $r\geq 1$ we will prove the following analogue of de Finetti's
classical theorem. One natural example of an $H$-exchangeable array is given by (recall the notation
in (2))

$$X_\alpha = \sigma\bigl((v_\beta)_{\beta\in p(\alpha)}\bigr),$$ (7)

where $\sigma:[0,1]^{r+1}\to\mathbb{R}$ is a measurable function, and $v_\alpha$ for $\alpha\in\mathcal{A}$ are i.i.d. random variables with
the uniform distribution on $[0,1]$. The reason this array is hierarchically exchangeable is because,
by the definition of $\pi$, the random variables $v_{\pi(\alpha)}$ for $\alpha\in\mathcal{A}$ are also i.i.d. and uniform on $[0,1]$,
$p(\pi(\alpha))=\pi(p(\alpha))$ and $X_{\pi(\alpha)}=\sigma\bigl((v_{\pi(\beta)})_{\beta\in p(\alpha)}\bigr)$. We will show the following.

Theorem 1 Any hierarchically exchangeable array $(X_\alpha)_{\alpha\in\mathbb{N}^r}$ can be generated in distribution as
in (7) for some measurable function $\sigma$.

This result is almost obvious and can be proved by a simple induction "from the root down". 
However, a similar approach does not seem to work in the hierarchical version of the Aldous- 
Hoover representation, which is the main motivation for this paper (at least, it is not clear at the 
moment how to make it work). Because of this, to prove Theorem [T} we will develop an induction 
argument "from the leaves up", which will then be adapted to obtain the following. 
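
To make the example (7) concrete, here is a minimal Python sketch that samples a finite truncation of such an array and builds one tree-preserving rearrangement by permuting the children of every vertex. The truncation to `width` children per vertex, the choice of `sigma`, and all function names are assumptions made for illustration; they are not part of the paper.

```python
import random
from itertools import product

def path(alpha):
    """Vertices on the path from the root () to alpha, as in (2)."""
    return [alpha[:p] for p in range(len(alpha) + 1)]

def meet(alpha, beta):
    """alpha ∧ beta as in (3): number of common vertices on the two root paths."""
    return len(set(path(alpha)) & set(path(beta)))

def sample_array(r, width, sigma, rng):
    """Sample X_alpha = sigma((v_beta), beta in p(alpha)) on the leaves {1..width}^r."""
    v = {}  # one uniform label per vertex of the tree, drawn lazily

    def label(vertex):
        if vertex not in v:
            v[vertex] = rng.random()
        return v[vertex]

    leaves = product(range(1, width + 1), repeat=r)
    return {alpha: sigma([label(b) for b in path(alpha)]) for alpha in leaves}

def random_tree_map(r, width, rng):
    """A map pi on {1..width}^r obtained by permuting the children of every vertex."""
    perms = {}  # perms[parent] plays the role of the injection pi_parent

    def pi(alpha):
        image = ()
        for p in range(len(alpha)):
            parent = alpha[:p]
            if parent not in perms:
                perm = list(range(1, width + 1))
                rng.shuffle(perm)
                perms[parent] = perm
            image += (perms[parent][alpha[p] - 1],)
        return image

    return pi
```

For instance, `sample_array(2, 3, sum, random.Random(0))` returns nine values, and any two leaves with a common parent are computed from the same labels at the root and at that parent. The map returned by `random_tree_map` preserves `meet`, which is the finite analogue of the defining condition of $H$.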

Let us consider an array of random variables $(X_{\alpha,i})$ indexed by $\alpha \in \mathbb{N}^r$ and $i \in \mathbb{N}$. Again, we
will say that it is hierarchically exchangeable, or $H$-exchangeable, if

$(X_{\pi(\alpha),\rho(i)})_{\alpha \in \mathbb{N}^r, i \in \mathbb{N}} \stackrel{d}{=} (X_{\alpha,i})_{\alpha \in \mathbb{N}^r, i \in \mathbb{N}}$ (8)

for all $\pi \in H$ and all injections $\rho : \mathbb{N} \to \mathbb{N}$. The case of $r = 1$ corresponds to the classical notion of
an exchangeable array and, by analogy with (7), a natural example of such an array is given by

$X_{\alpha,i} = \sigma((v_\beta)_{\beta \in p(\alpha)}, (v^i_\beta)_{\beta \in p(\alpha)})$, (9)

where $\sigma : [0,1]^{2(r+1)} \to \mathbb{R}$ is a measurable function, and all $v_\alpha$ and $v^i_\alpha$ for $\alpha \in \mathcal{A}$, $i \in \mathbb{N}$ are i.i.d.
random variables with the uniform distribution on $[0,1]$. We will prove the following analogue of
the Aldous-Hoover representation ([HJ, [0, fl6), [|7)) in the general case r > 1. 

Theorem 2 Any hierarchically exchangeable array $(X_{\alpha,i})_{\alpha \in \mathbb{N}^r, i \in \mathbb{N}}$ can be generated in distribution
as in (9) for some measurable function $\sigma$.

This hierarchical version of the Aldous-Hoover representation is motivated by the predictions 
about the structure of the Gibbs measure in diluted spin glass models that originate in the work of 
Mezard and Parisi [9 1 (see ffTOl or Chapter 4 in [fl2l for a more detailed mathematical formulation). 
The random variable $X_{\alpha,i}$ represents the magnetization of the $i$th spin in the pure state $\alpha$, and the tree
structure as above stems from the ultrametric organization of the pure states in the Parisi ansatz,
which was recently proved in [fTD . The reason why spin magnetizations $X_{\alpha,i}$ are hierarchically
exchangeable will be explained in future work, and here we only present the representation
result for such arrays, which might be of independent interest. 
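
As an illustration of the two-index example (9), the following Python sketch samples a finite truncation of the array $(X_{\alpha,i})$. Each vertex $\beta$ carries one label $v_\beta$ shared by all columns $i$ and one label $v^i_\beta$ per column. The truncation parameters `width` and `n_cols`, the function name, and the two-argument signature of `sigma` are assumptions made for this sketch only.

```python
import random
from itertools import product

def sample_two_index_array(r, width, n_cols, sigma, rng):
    """Sample X_{alpha,i} = sigma((v_beta), (v^i_beta)), beta in p(alpha), on a finite grid."""
    v = {}   # v[beta]: label of vertex beta, shared by all columns i
    vi = {}  # vi[(i, beta)]: label of vertex beta specific to column i

    def draw(store, key):
        if key not in store:
            store[key] = rng.random()
        return store[key]

    X = {}
    for alpha in product(range(1, width + 1), repeat=r):
        p_alpha = [alpha[:p] for p in range(r + 1)]  # path from the root to alpha
        for i in range(1, n_cols + 1):
            shared = [draw(v, b) for b in p_alpha]
            own = [draw(vi, (i, b)) for b in p_alpha]
            X[alpha, i] = sigma(shared, own)
    return X
```

Passing `sigma = lambda s, o: (tuple(s), tuple(o))` exposes the labels themselves: the shared part of `X[alpha, i]` does not depend on `i`, and the column-specific parts of two leaves agree exactly on the vertices common to their paths.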

Let us immediately point out a standard observation that will also be used implicitly a number
of times throughout the paper. Recall that a measurable space $(\Omega, \mathcal{B})$ is called a Borel space if
there exists a one-to-one function $\varphi$ from $\Omega$ onto a Borel subset $A \subseteq [0,1]$ such that both $\varphi$ and
$\varphi^{-1}$ are measurable. For example, it is well known that all complete separable metric spaces and
their Borel subsets are Borel spaces (see e.g. Section 13.1 in j5]|). The existence of the isomorphism
$\varphi$ automatically implies that if we can prove the representations of Theorems 1 and 2 in the case
when $X_\alpha$ and $X_{\alpha,i}$ take values in $[0,1]$ then the same representations hold when these random
elements (and functions $\sigma$) take values in a Borel space.

Finally, although this is not directly related to the results in this paper, an interested reader can 
find a study of another notion of exchangeability on (infinite infinitary) trees in Section III. 13 in 

2 Proof of Theorem 1

Theorems 1 and 2 are proved very similarly to their classical counterparts. The main difficulty is
to find the appropriate symmetries, and we begin with the easier case of the hierarchical de Finetti
representation. It is important to understand the argument in Lemmas 1 and 2, since we will not
repeat similar arguments in the proof of Theorem 2.

As we mentioned above, the case $r = 1$ is the famous de Finetti's theorem, and we will prove
the general case by induction on $r$. First of all, by (6), it is obviously enough to prove the
representation (7) for $\alpha \in (2\mathbb{N})^r$ only, i.e., for indices with all even coordinates. Given
$\beta = (n_1,\ldots,n_{r-1}) \in (2\mathbb{N})^{r-1}$, let us consider the array (see Fig. 2)

$T_\beta = (X_\alpha)_{\alpha \in I(\beta)}$, (10)

where

$I(\beta) = \{\alpha = (n_1,\ldots,n_{p-1},m_p,\ldots,m_r) \in \mathbb{N}^r \mid 1 \leq p \leq r$ and $m_p,\ldots,m_r \in 2\mathbb{N}-1\}$. (11)

In other words, the index set $I(\beta)$ consists of all $\alpha = (m_1,\ldots,m_r) \in \mathbb{N}^r$ such that $\alpha \wedge \beta = p$ implies
that $m_p,\ldots,m_r$ are odd. Given $n \in 2\mathbb{N}$ and $\alpha = \beta n$, it is obvious that the subtree connected to the
leaves in $\{\alpha\} \cup I(\beta)$ is a copy of $\mathcal{A}$, in the sense that there exists $\pi \in H$ defined in (4) such that
$\pi(\mathbb{N}^r) = \{\alpha\} \cup I(\beta)$; we can also ensure that $\pi(\alpha) = \alpha$. As a consequence of this, we will obtain
the following key result. For a subset $I \subseteq \mathbb{N}^r$, we will denote by $\mathcal{F}_I$ the $\sigma$-algebra generated by
the random variables $X_\alpha$ for $\alpha \in I$.

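Lemma 1 For any $\beta \in (2\mathbb{N})^{r-1}$, $n \in 2\mathbb{N}$ and $\alpha = \beta n$, the conditional expectations satisfy

$\mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{\mathbb{N}^r \setminus \{\alpha\}}) = \mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{I(\beta)})$
almost surely, for any bounded measurable function $f : \mathbb{R} \to \mathbb{R}$.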

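Proof. If $\pi \in H$ is such that $\pi(\mathbb{N}^r) = \{\alpha\} \cup I(\beta)$ and $\pi(\alpha) = \alpha$ then (6) implies the following
equality in distribution,

$\mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{\mathbb{N}^r \setminus \{\alpha\}}) \stackrel{d}{=} \mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{I(\beta)})$,
and, therefore, the equality of the $L^2$-norms,

$\|\mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{\mathbb{N}^r \setminus \{\alpha\}})\|_2 = \|\mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{I(\beta)})\|_2$. (12)
On the other hand, $I(\beta) \subseteq \mathbb{N}^r \setminus \{\alpha\}$ and $\mathcal{F}_{I(\beta)} \subseteq \mathcal{F}_{\mathbb{N}^r \setminus \{\alpha\}}$, which implies that

$\mathbb{E}\bigl(\mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{I(\beta)}) \, \mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{\mathbb{N}^r \setminus \{\alpha\}})\bigr) = \mathbb{E}\bigl(\mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{I(\beta)})^2\bigr) = \|\mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{I(\beta)})\|_2^2$.

Figure 2: Here $\alpha = \beta n \in (2\mathbb{N})^r$, and dotted edges represent coordinates in $2\mathbb{N}-1$. The points
(leaves) in the set $I(\beta)$ that index the array $T_\beta$ are connected to the path $p(\beta)$ by odd coordinates
only.

Expanding the square and combining this with (12) yields that

$\|\mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{\mathbb{N}^r \setminus \{\alpha\}}) - \mathbb{E}(f(X_\alpha) \mid \mathcal{F}_{I(\beta)})\|_2^2 = 0$, (13)
and this finishes the proof. □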
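
The proof above only uses the existence of a map $\pi \in H$ with $\pi(\mathbb{N}^r) = \{\alpha\} \cup I(\beta)$ and $\pi(\alpha) = \alpha$. The paper does not write this map down, so the Python sketch below gives one possible construction, stated here as an assumption: along the path to $\alpha$ the child on the path is kept fixed and the remaining children are sent bijectively onto the odd numbers, while off the path every child $m$ is sent to $2m-1$. The names `meet`, `in_I` and `copy_map` are hypothetical.

```python
def meet(a, b):
    """a ∧ b as in (3): number of common vertices (root included) on the two root paths."""
    k = 0
    while k < min(len(a), len(b)) and a[k] == b[k]:
        k += 1
    return k + 1

def in_I(alpha, beta):
    """Membership in I(beta) from (11), for beta with all coordinates even."""
    r = len(alpha)
    p = 0
    # length of the common prefix of alpha and beta (at most r - 1)
    while p < r - 1 and alpha[p] == beta[p]:
        p += 1
    # everything after the common prefix must be odd
    return all(m % 2 == 1 for m in alpha[p:])

def copy_map(alpha_star):
    """One choice of pi with pi(N^r) = {alpha*} ∪ I(beta) and pi(alpha*) = alpha*."""
    def pi(gamma):
        image = []
        on_path = True  # True while gamma still follows the path to alpha*
        for g, n in zip(gamma, alpha_star):
            if on_path and g == n:
                image.append(n)
            else:
                # rank of g among the children not equal to n (or among all children off the path)
                k = g if (not on_path or g < n) else g - 1
                image.append(2 * k - 1)
                on_path = False
        return tuple(image)
    return pi
```

With `alpha_star = (2, 4)`, for example, the leaf `(2, 5)` is sent to `(2, 7)` and the leaf `(3, 2)` to `(3, 3)`, both of which lie in $I(\beta)$ for $\beta = (2)$.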

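Next, we will prove the following consequence of Lemma 1. Consider an integer $k \geq 1$ and, for
$j \leq k$, consider arbitrary $\beta_j \in (2\mathbb{N})^{r-1}$, $n_j \in 2\mathbb{N}$ and let $\alpha_j = \beta_j n_j \in (2\mathbb{N})^r$. We suppose that all
$\alpha_j$'s are distinct.

Lemma 2 For any bounded measurable functions $f_1,\ldots,f_k : \mathbb{R} \to \mathbb{R}$,

$\mathbb{E}\bigl(\prod_{j \leq k} f_j(X_{\alpha_j}) \bigm| (T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}\bigr) = \prod_{j \leq k} \mathbb{E}(f_j(X_{\alpha_j}) \mid T_{\beta_j})$ (14)

almost surely.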

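Proof. We need to show that for any event $A$ in the $\sigma$-algebra generated by $(T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}$,

$\mathbb{E} I_A \prod_{j \leq k} f_j(X_{\alpha_j}) = \mathbb{E} I_A \prod_{j \leq k} \mathbb{E}(f_j(X_{\alpha_j}) \mid T_{\beta_j})$.

Since $I_A$ and $f_1(X_{\alpha_1}),\ldots,f_{k-1}(X_{\alpha_{k-1}})$ are $\mathcal{F}_{\mathbb{N}^r \setminus \{\alpha_k\}}$-measurable, we can write

$\mathbb{E} I_A \prod_{j \leq k} f_j(X_{\alpha_j}) = \mathbb{E} I_A \prod_{j \leq k-1} f_j(X_{\alpha_j}) \, \mathbb{E}(f_k(X_{\alpha_k}) \mid \mathcal{F}_{\mathbb{N}^r \setminus \{\alpha_k\}})$

$= \mathbb{E} I_A \prod_{j \leq k-1} f_j(X_{\alpha_j}) \, \mathbb{E}(f_k(X_{\alpha_k}) \mid T_{\beta_k})$,

where the second equality follows from Lemma 1, since $\mathcal{F}_{I(\beta_k)}$ is the $\sigma$-algebra generated by $T_{\beta_k}$.
This implies that

$\mathbb{E}\bigl(\prod_{j \leq k} f_j(X_{\alpha_j}) \bigm| (T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}\bigr) = \mathbb{E}\bigl(\prod_{j \leq k-1} f_j(X_{\alpha_j}) \bigm| (T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}\bigr) \, \mathbb{E}(f_k(X_{\alpha_k}) \mid T_{\beta_k})$

and (14) follows by induction on $k$. □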

Lemma 2 means that, conditionally on $(T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}$, the random variables $X_\alpha$ for $\alpha \in (2\mathbb{N})^r$
are independent and the distribution of $X_\alpha$ for $\alpha = \beta n$ depends only on $T_\beta$. Moreover, by symmetry,
the distribution of $X_\alpha$ given $T_\beta$ does not depend on $\alpha$ or $\beta$. Therefore, given $(T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}$, we can
generate the array $(X_\alpha)_{\alpha \in (2\mathbb{N})^r}$ in distribution by

$X_\alpha = f(v_\alpha, T_\beta)$ for $\alpha = \beta n$, (15)

where $(v_\alpha)$ are i.i.d. random variables with the uniform distribution on $[0,1]$, also independent of
$(T_\beta)$, and $f$ is some measurable function of $(v_\alpha, T_\beta)$. To finish the proof of Theorem 1 it is enough
to show that we can generate $(T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}$ in distribution by

$T_\beta = g((v_\gamma)_{\gamma \in p(\beta)})$, (16)

for some measurable function $g$ on $[0,1]^r$ with values in the space of the arrays $T_\beta$. This will follow
by the induction assumption if we can show that $(T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}$ is hierarchically exchangeable in
the appropriate sense, i.e., with respect to the collection of maps

$H^* = \{\pi^* : (2\mathbb{N})^{r-1} \to (2\mathbb{N})^{r-1} \mid \pi^*$ is an injection, $\pi^*(\alpha) \wedge \pi^*(\beta) = \alpha \wedge \beta$ for all $\alpha, \beta \in (2\mathbb{N})^{r-1}\}$.

(17)
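
The representation (15) rests on the standard fact that a random variable with a prescribed (conditional) distribution can be generated as a measurable function of an independent uniform random variable. The Python sketch below illustrates this for a discrete law on the real line using the quantile transform $f(v) = \inf\{x : F(x) \geq v\}$. The function name and the restriction to finitely many atoms are assumptions made for illustration; in (15) the law, and hence $f$, additionally depends on $T_\beta$.

```python
from bisect import bisect_left
from itertools import accumulate

def quantile_transform(values, probs):
    """Return f with f(v) = inf{x : F(x) >= v} for the law P(X = values[k]) = probs[k]."""
    pairs = sorted(zip(values, probs))
    xs = [x for x, _ in pairs]
    cdf = list(accumulate(p for _, p in pairs))  # F evaluated at the sorted atoms

    def f(v):
        # first atom whose cumulative probability reaches v
        k = bisect_left(cdf, v)
        # guard against floating-point rounding when cdf[-1] falls slightly below 1
        return xs[min(k, len(xs) - 1)]

    return f
```

If `v` is uniform on $[0,1]$, then `quantile_transform(values, probs)(v)` takes the value `values[k]` with probability `probs[k]`.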

If we can prove that for any $\pi^* \in H^*$,

$(T_{\pi^*(\beta)})_{\beta \in (2\mathbb{N})^{r-1}} \stackrel{d}{=} (T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}$, (18)

then (16) will follow from Theorem 1 with $r$ replaced by $r-1$ and $\mathbb{N}$ replaced by $2\mathbb{N}$, since the random
elements $T_\beta$ take values in a Borel space. To prove (18), we will show that for any $\pi^* \in H^*$ one
can find $\pi \in H$ such that (recall (10))

$T_{\pi^*(\beta)} = (X_\alpha)_{\alpha \in I(\pi^*(\beta))} = (X_{\pi(\alpha)})_{\alpha \in I(\beta)}$. (19)

Clearly, the existence of such $\pi$ together with (6) would imply (18). It should be almost obvious
how to construct such $\pi$. First of all, given $\pi^* \in H^*$, similarly to (5), let us define $\pi^*(\emptyset) := \emptyset$ and

if $\pi^*((n_1,\ldots,n_{r-1})) = (m_1,\ldots,m_{r-1})$ then let $\pi^*((n_1,\ldots,n_p)) := (m_1,\ldots,m_p)$, (20)

for $p \leq r-1$. As in (5), since $\pi^*(\alpha) \wedge \pi^*(\beta) = \alpha \wedge \beta$, $\pi^*$ is well-defined on $(2\mathbb{N})^p$ for all $p \leq r-1$.

Consider any $\alpha = (n_1,\ldots,n_r) \in \mathbb{N}^r$ and let $p$ be the first index such that $n_p \in 2\mathbb{N}-1$ if such an
odd coordinate exists; otherwise, we let $p = r$. Then, we define

$\pi(\alpha) := (m_1,\ldots,m_{p-1},n_p,\ldots,n_r)$ if $\pi^*((n_1,\ldots,n_{p-1})) = (m_1,\ldots,m_{p-1})$. (21)

First of all, it is a simple exercise to check that $\pi \in H$, i.e., $\pi(\alpha) \wedge \pi(\beta) = \alpha \wedge \beta$. Indeed, suppose
that only the first $p-1$ coordinates of $\alpha$ and $\beta$ are common and even. Then, obviously, they will
stay common after the map $\pi$. If the $p$th coordinates of $\alpha$ and $\beta$ are also equal then they must
be odd and, by the definition (21), all the coordinates starting from the $p$th coordinate will be left
unchanged, which implies that $\pi(\alpha) \wedge \pi(\beta) = \alpha \wedge \beta$. If the $p$th coordinates of $\alpha$ and $\beta$ are different
then the $p$th coordinates of $\pi(\alpha)$ and $\pi(\beta)$ will also be different: this is obvious when one of the
coordinates is odd and follows from (17) when both coordinates are even.
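
The construction (20)-(21) can be sketched in a few lines of Python. Here `pi_star_prefix` is assumed to be the extension (20) of some $\pi^* \in H^*$ to even prefixes of length at most $r-1$, and `shift_then_swap` is a hypothetical example of such a map chosen only for illustration; neither name comes from the paper.

```python
def meet(a, b):
    """a ∧ b as in (3): number of common vertices (root included) on the two root paths."""
    k = 0
    while k < min(len(a), len(b)) and a[k] == b[k]:
        k += 1
    return k + 1

def extend(pi_star_prefix):
    """Build pi on N^r from pi* acting on even prefixes of length <= r - 1, following (21)."""
    def pi(alpha):
        r = len(alpha)
        # p = (1-based) position of the first odd coordinate, or r if all are even
        p = next((q + 1 for q, n in enumerate(alpha) if n % 2 == 1), r)
        # apply pi* to the first p - 1 (even) coordinates and keep the rest unchanged
        return pi_star_prefix(alpha[:p - 1]) + alpha[p - 1:]
    return pi

def shift_then_swap(prefix):
    """A hypothetical pi*: shift the first coordinate by 2, swap 2 <-> 4 in the later ones."""
    swap = {2: 4, 4: 2}
    return tuple(n + 2 if q == 0 else swap.get(n, n) for q, n in enumerate(prefix))
```

For $r = 3$, `extend(shift_then_swap)` sends `(2, 4, 6)` to `(4, 2, 6)` and `(2, 3, 4)` to `(4, 3, 4)`, and leaves `(1, 2, 2)` unchanged because its first coordinate is already odd.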

It remains to check that (19) holds. Let us fix $\beta = (n_1,\ldots,n_{r-1}) \in (2\mathbb{N})^{r-1}$ and suppose that
$\pi^*(\beta) = (m_1,\ldots,m_{r-1})$. Then, by (21), for any $1 \leq p \leq r$ and any $l_p,\ldots,l_r \in 2\mathbb{N}-1$,

$\alpha = (n_1,\ldots,n_{p-1},l_p,\ldots,l_r) \iff \pi(\alpha) = (m_1,\ldots,m_{p-1},l_p,\ldots,l_r)$.

Recalling the definition (11), this means that $I(\pi^*(\beta)) = \pi(I(\beta))$ and, hence, (19) holds. This
finishes the proof of Theorem 1. □



3 Proof of Theorem 2

Again, the proof is by induction on r, but finding appropriate symmetries is now more involved. 

Step 1. Again, it is enough to prove the representation (9) for $\alpha \in (2\mathbb{N})^r$ and $i \in 2\mathbb{N}$. Let us fix
$\alpha = \beta n \in (2\mathbb{N})^r$ and $i \in 2\mathbb{N}$. Recall the definition of the set $I(\beta)$ in (11) and consider a subarray of
random variables (see Fig. 3)

$X_{\gamma,j}$ for $\gamma \in \{\alpha\} \cup I(\beta)$ and $j \in \{i\} \cup (2\mathbb{N}-1)$, (22)

which we will split into three parts,

$X_{\alpha,i}$, $Y_{\beta,n} = (X_{\alpha,j})_{j \in 2\mathbb{N}-1}$ and $Z_{\beta,i} = (X_{\gamma,j})_{\gamma \in I(\beta), j \in \{i\} \cup (2\mathbb{N}-1)}$. (23)

This subarray is, obviously, a copy of the entire array $(X_{\gamma,j})_{\gamma \in \mathbb{N}^r, j \in \mathbb{N}}$ in the sense that there exists
$\pi \in H$ and an injection $\rho : \mathbb{N} \to \mathbb{N}$ such that $\pi(\mathbb{N}^r) = \{\alpha\} \cup I(\beta)$, $\rho(\mathbb{N}) = \{i\} \cup (2\mathbb{N}-1)$ and
$\pi(\alpha) = \alpha$, $\rho(i) = i$. Using the hierarchical exchangeability property (8) and the fact that (22) is a
copy of the entire array, an argument identical to the proof of Lemma 1 implies that

$\mathbb{E}(f(X_{\alpha,i}) \mid (X_{\gamma,j})_{(\gamma,j) \neq (\alpha,i)}) = \mathbb{E}(f(X_{\alpha,i}) \mid Y_{\beta,n}, Z_{\beta,i})$ (24)

almost surely, for any bounded measurable function $f : \mathbb{R} \to \mathbb{R}$. Then, an argument identical to
the proof of Lemma 2 implies that, conditionally on all the random elements $Y_{\beta,n}$ and $Z_{\beta,j}$ for
$\beta \in (2\mathbb{N})^{r-1}$, $n \in 2\mathbb{N}$ and $j \in 2\mathbb{N}$, the random variables $X_{\alpha,i}$ are independent for all $\alpha \in (2\mathbb{N})^r$
and $i \in 2\mathbb{N}$, and the distribution of $X_{\alpha,i}$ for $\alpha = \beta n$ depends only on $Y_{\beta,n}$ and $Z_{\beta,i}$. Moreover, by
symmetry, this distribution does not depend on $\alpha$ and $i$. Therefore, conditionally on all $Y_{\beta,n}$ and
$Z_{\beta,i}$, we can generate $X_{\alpha,i}$ in distribution by

$X_{\alpha,i} = f(v^i_\alpha, Y_{\beta,n}, Z_{\beta,i})$ for $\alpha = \beta n$, (25)

where $(v^i_\alpha)$ are i.i.d. random variables with the uniform distribution on $[0,1]$, also independent of
all $Y_{\beta,n}$ and $Z_{\beta,i}$, and $f$ is some measurable function of its arguments.

Figure 3: Given $\alpha = \beta n \in (2\mathbb{N})^r$, the tree is the same as in Fig. 2. The dashed interval attached to each
leaf vertex represents the set of indices $i \in 2\mathbb{N}-1$, and a dot represents a fixed even index $i \in 2\mathbb{N}$.

Step 2. Let us now describe what kind of symmetries are satisfied by the collection of random
elements $Y_{\beta,n}$ and $Z_{\beta,i}$ for $\beta \in (2\mathbb{N})^{r-1}$, $n \in 2\mathbb{N}$ and $i \in 2\mathbb{N}$. Let us consider arbitrary injections
$\rho^* : 2\mathbb{N} \to 2\mathbb{N}$ and $\pi^* \in H^*$ in (17). Suppose that for each $\beta \in (2\mathbb{N})^{r-1}$ we are given an injection
$\pi^*_\beta : 2\mathbb{N} \to 2\mathbb{N}$. Let us now define a map $\pi \in H$ as in (20) and (21), with one modification,

$\pi(\alpha) := \pi^*(\beta)\pi^*_\beta(n)$ for $\alpha = \beta n \in (2\mathbb{N})^r$.

It is clear that $\pi \in H$ in (4). Let us also extend $\rho^*$ to an injection $\rho : \mathbb{N} \to \mathbb{N}$ by defining $\rho(j) = j$
for $j \in 2\mathbb{N}-1$. It is easy to see that for $\beta \in (2\mathbb{N})^{r-1}$, $n \in 2\mathbb{N}$ and $i \in 2\mathbb{N}$,

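$Y_{\pi^*(\beta),\pi^*_\beta(n)} = (X_{\pi^*(\beta)\pi^*_\beta(n),j})_{j \in 2\mathbb{N}-1} = (X_{\pi(\beta n),j})_{j \in 2\mathbb{N}-1}$

and, as in (19),

$Z_{\pi^*(\beta),\rho^*(i)} = (X_{\gamma,j})_{\gamma \in I(\pi^*(\beta)), j \in \{\rho^*(i)\} \cup (2\mathbb{N}-1)} = (X_{\pi(\gamma),\rho(j)})_{\gamma \in I(\beta), j \in \{i\} \cup (2\mathbb{N}-1)}$.
Therefore, the exchangeability property (8) implies that

$(Y_{\pi^*(\beta),\pi^*_\beta(n)}, Z_{\pi^*(\beta),\rho^*(i)})_{\beta \in (2\mathbb{N})^{r-1}, n \in 2\mathbb{N}, i \in 2\mathbb{N}} \stackrel{d}{=} (Y_{\beta,n}, Z_{\beta,i})_{\beta \in (2\mathbb{N})^{r-1}, n \in 2\mathbb{N}, i \in 2\mathbb{N}}$. (26)
In the next step, we will show that (26) implies that these arrays can be generated in distribution by

$Y_{\beta,n} = Y((v_\gamma)_{\gamma \in p(\beta)}, v_{\beta n})$ and $Z_{\beta,i} = Z((v_\gamma)_{\gamma \in p(\beta)}, (v^i_\gamma)_{\gamma \in p(\beta)})$ (27)
for some measurable functions $Y$ and $Z$. Plugging this into (25) will finish the proof of Theorem 2.

Figure 4: Here $\beta \in (2\mathbb{N})^{r-1}$ and dotted edges represent coordinates in $2\mathbb{N}-1$, so the leaves of
the tree are in $I'(\beta)$. Dashed intervals attached to each leaf $\gamma \in I'(\beta)$ represent $(Y_{\gamma,n})_{n \in 2\mathbb{N}-1}$ and
$(Z_{\gamma,j})_{j \in 2\mathbb{N}-1}$. Dots represent $Y_{\beta,n}$ and $Z_{\beta,j}$ for some choices of $n \in 2\mathbb{N}$ and $j \in 2\mathbb{N}$.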



Step 3. To prove that (26) implies (27), let us, for simplicity of notation, replace $2\mathbb{N}$ by $\mathbb{N}$, i.e.,
suppose that the arrays in (26) are indexed by $\beta \in \mathbb{N}^{r-1}$, $n \in \mathbb{N}$, $i \in \mathbb{N}$ and

$(Y_{\pi(\beta),\pi_\beta(n)}, Z_{\pi(\beta),\rho(i)})_{\beta \in \mathbb{N}^{r-1}, n \in \mathbb{N}, i \in \mathbb{N}} \stackrel{d}{=} (Y_{\beta,n}, Z_{\beta,i})_{\beta \in \mathbb{N}^{r-1}, n \in \mathbb{N}, i \in \mathbb{N}}$ (28)

for any injections $\pi : \mathbb{N}^{r-1} \to \mathbb{N}^{r-1}$ such that $\pi(\alpha) \wedge \pi(\beta) = \alpha \wedge \beta$, $\pi_\beta : \mathbb{N} \to \mathbb{N}$ for all $\beta \in \mathbb{N}^{r-1}$
and $\rho : \mathbb{N} \to \mathbb{N}$. At this point, we will forget that in the previous step $Y_{\beta,n}$ and $Z_{\beta,i}$ were some
arrays of random variables, and we will treat them simply as real-valued random variables. Again,
we can do this because, once we prove that (28) implies (27) for random variables, the
result automatically follows for random elements taking values in Borel spaces. As usual, by the
assumption (28), it is enough to prove the representation (27) for $\beta \in (2\mathbb{N})^{r-1}$, $n \in 2\mathbb{N}$ and $i \in 2\mathbb{N}$.
Given $\beta = (n_1,\ldots,n_{r-1}) \in (2\mathbb{N})^{r-1}$ and the index set

$I'(\beta) = \{\beta\} \cup \{(n_1,\ldots,n_{p-1},m_p,\ldots,m_{r-1}) \mid 1 \leq p \leq r-1$ and $m_p,\ldots,m_{r-1} \in 2\mathbb{N}-1\}$, (29)

let us consider the subarray (see Fig. 4)

$T_\beta = (Y_{\gamma,n}, Z_{\gamma,j})_{\gamma \in I'(\beta), n \in 2\mathbb{N}-1, j \in 2\mathbb{N}-1}$. (30)

There are two reasons why we consider this array, as we shall now explain.

First of all, given any $n \in 2\mathbb{N}$, the array $(Y_{\beta,n}, T_\beta)$ is, obviously, a copy of the entire array
on the right hand side of (28) in the sense that one can find $\pi$, $(\pi_\gamma)_{\gamma \in \mathbb{N}^{r-1}}$ and $\rho$ such that the left
hand side of (28) consists of the random variables in the array $(Y_{\beta,n}, T_\beta)$. Then, the argument in the
proof of Lemmas 1 and 2 implies that, conditionally on $(T_\beta)_{\beta \in (2\mathbb{N})^{r-1}}$, the random variables $Y_{\beta,n}$
are independent of each other, as well as of $Z_{\beta,i}$ for all $\beta \in (2\mathbb{N})^{r-1}$, $n \in 2\mathbb{N}$ and $i \in 2\mathbb{N}$, and can
be generated in distribution as

$Y_{\beta,n} = g(v_{\beta n}, T_\beta)$ (31)

for some measurable function $g$ of its arguments.

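The second reason to introduce the array $T_\beta$ is that we can describe the joint distribution
of $T_\beta$ and $Z_{\beta,i}$ for all $\beta \in (2\mathbb{N})^{r-1}$ and $i \in 2\mathbb{N}$, using the statement of Theorem 2 with $r$ replaced
by $r-1$. In other words, this is where the induction on $r$ will play its role. Given $\beta \in (2\mathbb{N})^{r-1}$ and
$i \in 2\mathbb{N}$, define

$U_{\beta,i} = (T_\beta, Z_{\beta,i})$. (32)
Consider any $\pi^* \in H^*$ defined in (17) and an injection $\rho^* : 2\mathbb{N} \to 2\mathbb{N}$ and let us show that

$(U_{\pi^*(\beta),\rho^*(i)})_{\beta \in (2\mathbb{N})^{r-1}, i \in 2\mathbb{N}} \stackrel{d}{=} (U_{\beta,i})_{\beta \in (2\mathbb{N})^{r-1}, i \in 2\mathbb{N}}$. (33)

Let us extend $\rho^*$ and $\pi^*$ to injections $\rho : \mathbb{N} \to \mathbb{N}$ and $\pi : \mathbb{N}^{r-1} \to \mathbb{N}^{r-1}$ such that $\pi(\alpha) \wedge \pi(\beta) =
\alpha \wedge \beta$, as follows. We let $\rho(i) = \rho^*(i)$ for $i \in 2\mathbb{N}$ and $\rho(i) = i$ for $i \in 2\mathbb{N}-1$. We let $\pi(\beta) = \pi^*(\beta)$
for $\beta \in (2\mathbb{N})^{r-1}$ and define $\pi(\beta)$ for $\beta \in \mathbb{N}^{r-1} \setminus (2\mathbb{N})^{r-1}$ as follows. If $\beta = (n_1,\ldots,n_{r-1})$ and $n_p$ is
the first coordinate in $2\mathbb{N}-1$ then, recalling the definition (20), we define

$\pi(\beta) := (m_1,\ldots,m_{p-1},n_p,\ldots,n_{r-1})$ if $\pi^*((n_1,\ldots,n_{p-1})) = (m_1,\ldots,m_{p-1})$. (34)

Finally, for all $\beta \in \mathbb{N}^{r-1}$ we define $\pi_\beta(n) = n$. It is easy to check that with these choices of $\rho$, $\pi$
and $(\pi_\beta)$, for any $\beta \in (2\mathbb{N})^{r-1}$ and $i \in 2\mathbb{N}$ we have $Z_{\pi^*(\beta),\rho^*(i)} = Z_{\pi(\beta),\rho(i)}$ and

$T_{\pi^*(\beta)} = (Y_{\gamma,n}, Z_{\gamma,j})_{\gamma \in I'(\pi^*(\beta)), n \in 2\mathbb{N}-1, j \in 2\mathbb{N}-1} = (Y_{\pi(\gamma),\pi_\gamma(n)}, Z_{\pi(\gamma),\rho(j)})_{\gamma \in I'(\beta), n \in 2\mathbb{N}-1, j \in 2\mathbb{N}-1}$.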

Therefore, it is clear that (33) is a consequence of (28). Of course, (33) is exactly the hierarchical
exchangeability property (8) with $r$ replaced by $r-1$ and the index set $\mathbb{N}$ replaced by $2\mathbb{N}$, and the
induction assumption implies that we can generate the array $(U_{\beta,i})_{\beta \in (2\mathbb{N})^{r-1}, i \in 2\mathbb{N}}$ in distribution by

$U_{\beta,i} = h((v_\gamma)_{\gamma \in p(\beta)}, (v^i_\gamma)_{\gamma \in p(\beta)})$. (35)

If we write $h = (h_1, h_2)$, where $h_1$ and $h_2$ correspond to the coordinates $T_\beta$ and $Z_{\beta,i}$ in (32), then it
is obvious that $h_1$ does not depend on the coordinates $(v^i_\gamma)_{\gamma \in p(\beta)}$ and, therefore,

$T_\beta = h_1((v_\gamma)_{\gamma \in p(\beta)})$ and $Z_{\beta,i} = h_2((v_\gamma)_{\gamma \in p(\beta)}, (v^i_\gamma)_{\gamma \in p(\beta)})$.
Combining this with (31) implies (27) and finishes the proof of Theorem 2. □